Viewing a web page with a Telnet program,
and how this changes from HTTP 1.0 to HTTP 1.1

You can view websites with a Telnet program, and it is quite simple to do:
use the Telnet program to connect to the web server on the port that the HTTP
service is running on. The standard HTTP port is 80, but the server admin can
configure the service to listen on any valid port.

Once you are connected to the web server, all you need to type is a line like
the following:

GET / HTTP/1.0

Note that after typing this line, you must press ENTER twice. The first ENTER
ends the request line; the second sends an empty line (a CR/LF on its own),
which is the signal to the web server that the client's request is complete.

The forward slash in the middle of the command indicates that you are
requesting the "root" of the website; remember that standard Unix practice is
to refer to a root directory with a forward slash, and to use the same slash
as a directory name divider. You could modify this same line to request, for
example, a file called "foo.htm" in a folder called "htdocs" on the server:

GET /htdocs/foo.htm HTTP/1.0
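
The same one-line request can also be sent from a short program instead of
being typed by hand. What follows is only a minimal sketch in Python: the
function names build_http10_request and fetch are made up for illustration,
and it assumes a plain HTTP server that closes the connection once it has
sent its response.

```python
import socket


def build_http10_request(path="/"):
    """Return the exact bytes of a one-line HTTP/1.0 GET request."""
    # The request line ends with CR/LF; the second CR/LF is the empty
    # line that tells the server the request is complete.
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch(host, request, port=80, timeout=10.0):
    """Send raw request bytes to host:port and return the raw response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("www.servername.com", build_http10_request("/htdocs/foo.htm"))
would return the status line, headers and body as one block of bytes, much as
they would scroll past in a Telnet window.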

All of this seems pretty simple, and it is, but you may have tried this same
thing with HTTP version 1.1, and found that suddenly your requests are
getting rejected by the web server as HTTP 400 ("Bad Request") errors. The
reason for this, as you might have guessed, is that HTTP 1.1 has a specific
requirement that the HTTP 1.0 standard didn't have: All HTTP 1.1 requests
must have a "Host" header field.

As it turns out, adding a Host field isn't too hard to do. It just means
adding a second line to your GET request (followed, as before, by an empty
line), so the request ends up looking like this:

GET / HTTP/1.1
Host: www.servername.com
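
As a rough sketch, the two-line request can be assembled like this in Python.
The function name build_http11_request and its close parameter are
assumptions made for this example. The optional "Connection: close" line is
not required; it is included only because HTTP 1.1 connections stay open by
default, which matters to a program that reads until the server hangs up.

```python
def build_http11_request(host, path="/", close=False):
    """Return the bytes of a minimal HTTP/1.1 GET request with a Host header."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}"]
    if close:
        # Optional: ask the server to close the connection after responding,
        # since HTTP/1.1 connections stay open by default.
        lines.append("Connection: close")
    # Each line ends with CR/LF, and an empty line ends the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")
```

With the defaults, build_http11_request("www.servername.com") produces exactly
the two lines shown above plus the terminating empty line.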

This was made a requirement because many web servers serve more than one
domain name. For example, you could own two websites, located at
www.somewhere.com and www.somewhereelse.com; these sites could contain
completely different content, but be running on the same server and the same
IP address. In that case, a request for just "/" does not say which site's
root you want, and without a Host field the server has no way of knowing
which host you're actually trying to access.
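
To make the idea concrete, here is a simplified sketch of how a server might
use the Host field to choose between two sites. It is not how any particular
web server is implemented; the SITES table, the directory paths and the
function names are all hypothetical.

```python
# Hypothetical mapping from host name to document root (paths are made up).
SITES = {
    "www.somewhere.com": "/var/www/somewhere",
    "www.somewhereelse.com": "/var/www/somewhereelse",
}


def host_header(raw_request):
    """Return the value of the Host header in a raw request, or None."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            return value.strip()
    return None


def document_root(raw_request, sites=SITES):
    """Pick the document root for a request, or None if it cannot be chosen."""
    host = host_header(raw_request)
    if host is None:
        return None
    # Host names are case-insensitive; drop an optional ":port" suffix.
    return sites.get(host.lower().split(":", 1)[0])
```

Given the same "GET /" line, the result depends only on the Host field: one
value leads to the first site's directory, the other to the second, and a
request with no Host field gives the server nothing to choose with.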

The following is a direct quote from section 14.23 of RFC 2616, the RFC which
defines HTTP 1.1:

"A client MUST include a Host header field in all HTTP/1.1 request messages
... All Internet-based HTTP/1.1 servers MUST respond with a 400 (Bad Request)
status code to any HTTP/1.1 request message which lacks a Host header field."

...This explains why those servers give HTTP 400 errors to "Host"less
requests! They need to do this to be RFC-compliant.
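
The rule in that quote is simple enough to sketch in a few lines of Python.
This is only an illustration of the check, with a made-up function name
(status_for); a real server has many more possible outcomes than 200 and 400.

```python
def status_for(raw_request):
    """Return 400 for an HTTP/1.1 request with no Host header, else 200."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    # The protocol version is the last space-separated token of the request line.
    version = request_line.rsplit(" ", 1)[-1]
    names = {line.partition(":")[0].strip().lower() for line in header_lines}
    if version == "HTTP/1.1" and "host" not in names:
        return 400  # Bad Request, as the quoted passage requires
    return 200  # simplified: every other request is treated as acceptable
```

An HTTP 1.0 request without a Host field passes this check, which matches the
behavior described earlier: only HTTP 1.1 requests are rejected for lacking it.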

The "Host" field is the only required field that was added from HTTP 1.0 to
HTTP 1.1, so when you're getting web pages with a Telnet program, this simply
increases the number of lines you must type from 1 to 2. Not too bad.

Note that some web servers may let you get away with sending an HTTP 1.1
request without a Host field (although this is non-standard) if you specify
the host name in the GET line as part of an absolute URI, scheme included.
For example, you could type this:

GET http://www.servername.com/ HTTP/1.1

Many servers probably tolerate this violation of standards because RFC 2616
(section 5.2) also declares that "If Request-URI is an absoluteURI, the host
is part of the Request-URI. Any Host header field value in the request MUST
be ignored." That is to say, if the host name is already included in the
request line, the server takes the host from there and ignores the actual
Host header field. The RFC still requires clients to send the Host field, so
this is not guaranteed to work, but you might still be able to view websites
with single-line requests, although those requests will still end up being
somewhat longer with HTTP 1.1 than HTTP 1.0.
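
The precedence rule in that quote can be sketched as follows. This is an
illustrative Python fragment, not code from any real server: the function
name effective_host is invented, and it assumes that an absolute URI is one
that carries a scheme such as "http://" in front of the host name.

```python
from urllib.parse import urlsplit


def effective_host(request_target, host_header=None):
    """Return the host a request is aimed at, or None if it cannot be told."""
    parts = urlsplit(request_target)
    if parts.scheme and parts.hostname:
        # Absolute URI: the host comes from the URI itself, and any
        # Host header value is ignored.
        return parts.hostname
    # Otherwise fall back to the Host header (None if it is missing).
    return host_header
```

For the target "http://www.servername.com/" the host is taken from the URI
even when a different Host value is supplied, while for a plain "/" the Host
field is the only source of the host name.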
